Data Analysts
2025-06-30
In this notebook, we perform an exploratory and descriptive analysis of the Travel Demand Dataset, using flight data accessed via the Amadeus API. The project aims to uncover travel trends to key global tech hubs—San Francisco, London, Bangalore, Singapore, and Tel Aviv—with the goal of informing strategic decisions for travel brands.
This analysis focuses on key variables such as flight price, number of stops, travel time, and available seats. Through data visualization and summary statistics, we explore underlying patterns and relationships in the data. This foundational work is critical for identifying trends, spotting potential biases, and guiding future modeling or policy-focused analysis.
We begin our analysis by importing core Python libraries including pandas, numpy, seaborn, matplotlib.pyplot, plotly.express, and os. These tools support efficient data manipulation, numerical analysis, visualization, and directory management. Warnings are suppressed to ensure a cleaner output during rendering.
To maintain a reproducible workflow, we define directory paths for storing raw data, processed files, results, and documentation. The cleaned dataset is loaded from the processed directory into a Pandas DataFrame for analysis.
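A minimal sketch of such a directory setup is shown below. The folder names, the `build_dirs`, `ensure_dirs`, and `load_processed` helpers, and the CSV file format are all assumptions for illustration; the notebook's actual paths and file names are not given here.

```python
import os

import pandas as pd


def build_dirs(root="."):
    """Return the project directories as a dict (folder names are hypothetical)."""
    return {
        "raw": os.path.join(root, "data", "raw"),
        "processed": os.path.join(root, "data", "processed"),
        "results": os.path.join(root, "results"),
        "docs": os.path.join(root, "docs"),
    }


def ensure_dirs(dirs):
    """Create each directory if it does not already exist."""
    for path in dirs.values():
        os.makedirs(path, exist_ok=True)


def load_processed(dirs, filename):
    """Read a cleaned CSV from the processed directory into a DataFrame."""
    return pd.read_csv(os.path.join(dirs["processed"], filename))
```

Keeping every path in one dictionary means a change to the project layout only has to be made in one place.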
The dataset contains 7,870 rows and 8 columns, covering both numerical and categorical flight-related variables.
Preprocessing includes converting the Departure Date column to datetime format and cleaning the Price (USD) field to numeric.
Travel time is converted to timedelta format and analyzed in hours for correlation and visualization purposes. This EDA phase sets the foundation for statistical testing by uncovering distributions, group differences, and potential relationships within the dataset.
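The preprocessing steps described here could look roughly like the sketch below. The `Departure Date` and `Price (USD)` column names come from the text above; the `Travel Time` and `Travel Hours` column names, the string formats, and the toy rows are assumptions made only for illustration.

```python
import pandas as pd

# Toy rows standing in for the real dataset; all values are made up.
df = pd.DataFrame({
    "Departure Date": ["2025-07-01", "2025-07-02"],
    "Price (USD)": ["$1,234.50", "980"],
    # Assumed column name and string format for the duration field.
    "Travel Time": ["0 days 12:30:00", "0 days 05:15:00"],
})

# Parse departure dates into datetime64 values.
df["Departure Date"] = pd.to_datetime(df["Departure Date"])

# Strip currency symbols and thousands separators, then convert to float.
# Anything that still cannot be parsed becomes NaN instead of raising.
df["Price (USD)"] = pd.to_numeric(
    df["Price (USD)"].astype(str).str.replace(r"[^0-9.]", "", regex=True),
    errors="coerce",
)

# Convert durations to timedelta, then express them in hours.
df["Travel Time"] = pd.to_timedelta(df["Travel Time"])
df["Travel Hours"] = df["Travel Time"].dt.total_seconds() / 3600
```

Using `errors="coerce"` makes unparseable prices visible as missing values, which can then be counted or dropped explicitly.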
This bar chart shows the average flight price to each tech hub.
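The aggregation behind a chart like this can be sketched as follows. The `Destination` column name and the toy prices are assumptions, not values from the dataset.

```python
import pandas as pd

# Toy data for illustration only; prices are invented.
df = pd.DataFrame({
    "Destination": ["London", "London", "Singapore", "Tel Aviv"],
    "Price (USD)": [600.0, 800.0, 1200.0, 900.0],
})

# Mean price per destination, most expensive first.
avg_price = (
    df.groupby("Destination")["Price (USD)"]
    .mean()
    .sort_values(ascending=False)
)
```

The resulting Series can be passed to a plotting call such as `sns.barplot(x=avg_price.index, y=avg_price.values)` to draw the bars.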
A scatterplot showing the relationship between travel time and price across destinations.
A grouped bar chart comparing number of stops per destination.
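One way to build the counts for a grouped bar chart of this kind is a cross-tabulation. The `Destination` and `Stops` column names and the toy rows are assumed for illustration.

```python
import pandas as pd

# Toy data for illustration only.
df = pd.DataFrame({
    "Destination": ["London", "London", "London", "Singapore", "Singapore"],
    "Stops": [0, 1, 1, 1, 2],
})

# Rows are destinations, columns are stop counts, cells are flight counts.
stop_counts = pd.crosstab(df["Destination"], df["Stops"])
```

Each row of `stop_counts` then corresponds to one group of bars, with one bar per number of stops.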
A line plot showing average daily prices per destination.
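The daily averages for such a line plot could be computed as in the sketch below, assuming a `Destination` column and using invented toy rows.

```python
import pandas as pd

# Toy data for illustration only; dates and prices are invented.
df = pd.DataFrame({
    "Departure Date": pd.to_datetime(
        ["2025-07-01", "2025-07-01", "2025-07-02", "2025-07-02"]
    ),
    "Destination": ["London", "London", "London", "Singapore"],
    "Price (USD)": [600.0, 800.0, 650.0, 1200.0],
})

# Mean price per day and destination, reshaped so that each destination
# becomes one column (one line in the plot). Days without flights to a
# destination show up as NaN.
daily_avg = (
    df.groupby(["Departure Date", "Destination"])["Price (USD)"]
    .mean()
    .unstack("Destination")
)
```

With one column per destination, calling `daily_avg.plot()` would draw one line per city over the departure dates.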
A boxplot illustrating the spread and median of travel durations per city.
Flight prices vary significantly across destinations.
→ Some tech hubs (like Singapore or Tel Aviv) consistently have higher prices, suggesting pricing is influenced by distance, demand, or airline competition.
Flight prices also differ by number of stops.
→ Nonstop flights tend to be more expensive, while multi-stop flights are more affordable but may offer lower convenience.
There is a positive correlation between travel time and price.
→ Longer travel durations are moderately associated with higher prices, suggesting airlines adjust pricing based on route length.
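A correlation of this kind can be checked with a Pearson coefficient, as in the sketch below. The `Travel Hours` column name and the toy values are assumptions; the coefficient for the real dataset is not reproduced here.

```python
import pandas as pd

# Toy data for illustration only; not taken from the dataset.
df = pd.DataFrame({
    "Travel Hours": [5.0, 8.0, 12.0, 20.0],
    "Price (USD)": [400.0, 650.0, 700.0, 1100.0],
})

# Series.corr computes the Pearson correlation coefficient by default.
r = df["Travel Hours"].corr(df["Price (USD)"])
```

A value of `r` between 0 and 1 indicates a positive linear association. Correlation alone does not show that duration drives price, since distance and demand may influence both.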
Set different prices for each destination.
Some cities are much more expensive to fly to—adjust prices to match demand and travel costs.
Price flights based on number of stops.
Nonstop flights are more convenient and can be priced higher, while multi-stop flights can attract budget travelers.
Consider travel time in pricing.
Longer flights often cost more. Make sure the price reflects the duration, and offer better services on long routes.
There are clear differences in flight prices depending on where the flight goes, how many stops it has, and how long it takes. These insights show that pricing should not be the same for every route. By using this data, airlines and travel platforms can set smarter prices, meet traveler needs better, and increase profits.